6 research outputs found
Hierarchical Topological Ordering with Conditional Independence Test for Limited Time Series
Learning directed acyclic graphs (DAGs) to identify causal relations
underlying observational data is crucial but also poses significant challenges.
Recently, topology-based methods have emerged as a two-step approach to
discovering DAGs by first learning the topological ordering of variables and
then eliminating redundant edges, while ensuring that the graph remains
acyclic. However, these methods generate numerous
spurious edges that require subsequent pruning. To overcome this limitation, in
this paper, we propose an improvement to topology-based methods by introducing
limited time series data, consisting of only two cross-sectional records that
need not be adjacent in time and are subject to flexible timing. By
incorporating conditional instrumental variables as exogenous interventions, we
aim to identify descendant nodes for each variable. Following this line, we
propose a hierarchical topological ordering algorithm with conditional
independence test (HT-CIT), which enables the efficient learning of sparse DAGs
with a smaller search space compared to other popular approaches. The HT-CIT
algorithm greatly reduces the number of edges that need to be pruned. Empirical
results from synthetic and real-world datasets demonstrate the superiority of
the proposed HT-CIT algorithm.
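The generic two-step idea described in this abstract (fix a topological ordering, then prune redundant edges) can be illustrated with a minimal sketch. This is not the HT-CIT algorithm: the function names are hypothetical, and a hand-written oracle stands in for the conditional independence test that a real method would run on data.

```python
from itertools import combinations


def candidate_edges(order):
    """Return every edge consistent with a topological order (earlier -> later)."""
    return [(a, b) for a, b in combinations(order, 2)]


def prune(edges, is_spurious):
    """Drop the edges that a user-supplied test flags as spurious.

    `is_spurious` is a hypothetical stand-in for a conditional independence test.
    """
    return [edge for edge in edges if not is_spurious(edge)]


# Toy example: assume the true DAG is the chain A -> B -> C -> D.
order = ["A", "B", "C", "D"]
true_edges = {("A", "B"), ("B", "C"), ("C", "D")}

dense = candidate_edges(order)                        # n * (n - 1) / 2 candidate edges
sparse = prune(dense, lambda e: e not in true_edges)  # oracle replaces a real test

print(len(dense), "candidate edges;", len(sparse), "kept after pruning")
```

A full ordering over n variables yields n(n-1)/2 candidate edges, which is why a sparser starting graph leaves fewer edges to prune.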
Edge-Cloud Polarization and Collaboration: A Comprehensive Survey for AI
Influenced by the great success of deep learning via cloud computing and the
rapid development of edge chips, research in artificial intelligence (AI) has
shifted to both of the computing paradigms, i.e., cloud computing and edge
computing. In recent years, we have witnessed significant progress in
developing more advanced AI models on cloud servers that surpass traditional
deep learning models owing to model innovations (e.g., Transformers, Pretrained
families), explosion of training data and soaring computing capabilities.
However, edge computing, and especially edge-cloud collaborative computing, is
still in its infancy, owing to resource-constrained IoT scenarios in which
very few algorithms have been deployed. In this survey, we conduct
a systematic review of both cloud and edge AI. Specifically, we are the first
to set up the collaborative learning mechanism for cloud and edge modeling with
a thorough review of the architectures that enable such a mechanism. We also
discuss potentials and practical experiences of some on-going advanced edge AI
topics including pretraining models, graph neural networks and reinforcement
learning. Finally, we discuss the promising directions and challenges in this
field.
Comment: 20 pages, Transactions on Knowledge and Data Engineering
Learning Instrumental Variable from Data Fusion for Treatment Effect Estimation
The advent of the big data era has brought new opportunities and challenges for estimating treatment effects from fused data, that is, a mixed dataset collected from multiple sources (each source with an independent treatment assignment mechanism). Because source labels may be omitted and confounders may be unmeasured, traditional methods cannot estimate individual treatment assignment probabilities or infer treatment effects effectively. Therefore, we propose to reconstruct the source label and model it as a Group Instrumental Variable (GIV) to implement IV-based regression for treatment effect estimation. In this paper, we conceptualize this line of thought and develop a unified framework (Meta-EM) to (1) map the raw data into a representation space to construct Linear Mixed Models for the assigned treatment variable; (2) estimate the distribution differences and model the GIV for the different treatment assignment mechanisms; and (3) adopt an alternating training strategy to iteratively optimize the representations and the joint distribution to model the GIV for IV regression. Empirical results demonstrate the advantages of Meta-EM compared with state-of-the-art methods. The project page with the code and the supplementary materials is available at https://github.com/causal-machine-learning-lab/meta-em
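To see why an instrument helps when confounders are unmeasured, consider a minimal sketch of the textbook single-instrument (Wald) IV estimator. It is not Meta-EM: here the source label `z` is assumed to be already known rather than reconstructed, the data are a hand-built toy set, and all names are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Population covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


# Toy data (assumed): z is the source label, u an unmeasured confounder.
# Treatment t depends on both; outcome y = 2 * t + 3 * u, so the true effect is 2.
z = [0, 0, 1, 1]
u = [-1, 1, -1, 1]
t = [zi + ui for zi, ui in zip(z, u)]
y = [2 * ti + 3 * ui for ti, ui in zip(t, u)]

ols_estimate = cov(t, y) / cov(t, t)  # naive regression, biased by u
iv_estimate = cov(z, y) / cov(z, t)   # instrument-based estimate

print("OLS:", ols_estimate, "IV:", iv_estimate)
```

Because `z` is balanced against `u` in this toy set, the IV ratio recovers the true effect while the naive regression slope does not.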
Integrative analysis of differential genes and identification of a “2‐gene score” associated with survival in esophageal squamous cell carcinoma
Background: Developments in high‐throughput genomic technologies have led to improved understanding of the molecular underpinnings of esophageal squamous cell carcinoma (ESCC). However, there is currently no model that combines clinical features and gene expression signatures to predict outcomes. Methods: We obtained data from the GSE53625 database of Chinese ESCC patients who had undergone surgical treatment. The R packages Limma and WGCNA were used, respectively, to identify differentially expressed genes and to construct their co‐expression network. The Cox regression model was used, and a nomogram prediction model was constructed. Results: A total of 3654 differentially expressed genes were identified. Bioinformatics enrichment analysis was conducted. Multivariate analysis of the clinical cohort revealed that age and adjuvant therapy were independent factors for survival, and these were entered into the clinical nomogram. After integrating the gene expression profiles, we identified a “2‐gene score” associated with overall survival. The combined model comprises clinical data and gene expression profiles. The C‐index of the combined nomogram for predicting survival was statistically higher than that of the clinical nomogram. The calibration curve showed that the predictions of the combined nomogram agreed more closely with actual observations than those of the clinical nomogram alone. Conclusions: The integration of gene expression signatures and clinical variables produced a predictive model for ESCC that performed better than those based exclusively on clinical variables. This approach may provide a novel prediction model for ESCC patients after surgery.
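The C‐index used to compare the two nomograms can be sketched as follows. This is a minimal illustration of Harrell's concordance index, not the authors' R pipeline; the function name and the toy survival data are assumptions.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] == 1) strictly before time j. It is concordant when the
    model gave i the higher risk; tied risks count as half.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy cohort (assumed): survival times, event flags (0 = censored), two risk models.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_model_a = [0.9, 0.7, 0.4, 0.1]
risk_model_b = [0.9, 0.1, 0.4, 0.7]

c_a = concordance_index(times, events, risk_model_a)
c_b = concordance_index(times, events, risk_model_b)
print("C-index A:", c_a, "C-index B:", c_b)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a higher C‐index indicates better discrimination.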